NSF PAR Search | NSF Public Access Repository

Energetic Variational Gaussian Process Regression

https://doi.org/10.1109/WSC63780.2024.10838889

Kang, Lulu; Cheng, Yuanxing; Wang, Yiwei; Liu, Chun (December 2024, IEEE)

Full Text Available
Fair Multivariate Adaptive Regression Splines for Ensuring Equity and Transparency

https://doi.org/10.1609/aaai.v38i20.30211

Haghighat, Parian; Gándara, Denisa; Kang, Lulu; Anahideh, Hadis (March 2024, Proceedings of the AAAI Conference on Artificial Intelligence)

Predictive analytics has been widely used in various domains, including education, to inform decision-making and improve outcomes. However, many predictive models are proprietary and inaccessible for evaluation or modification by researchers and practitioners, limiting their accountability and ethical design. Moreover, predictive models are often opaque and incomprehensible to the officials who use them, reducing their trust and utility. Furthermore, predictive models may introduce or exacerbate bias and inequity, as they have done in many sectors of society. Therefore, there is a need for transparent, interpretable, and fair predictive models that can be easily adopted and adapted by different stakeholders. In this paper, we propose a fair predictive model based on multivariate adaptive regression splines (MARS) that incorporates fairness measures in the learning process. MARS is a non-parametric regression model that performs feature selection, handles non-linear relationships, generates interpretable decision rules, and derives optimal splitting criteria on the variables. Specifically, we integrate fairness into the knot optimization algorithm and provide theoretical and empirical evidence of how it results in a fair knot placement. We apply our fairMARS model to real-world data and demonstrate its effectiveness in terms of accuracy and equity. Our paper contributes to the advancement of responsible and ethical predictive analytics for social good.
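As a rough illustration of what fairness-aware knot selection could look like, the sketch below fits a single mirrored pair of MARS hinge functions at each candidate knot and scores the knot by overall mean squared error plus a penalty on the gap in error between groups. The penalty form, the weight `lam`, the small ridge term, and all function names are assumptions made for this example; the abstract does not specify the fairness measure or the optimization used in fairMARS.

```python
def hinge_pair(x, knot):
    """Mirrored MARS hinge features: max(0, x - knot) and max(0, knot - x)."""
    return max(0.0, x - knot), max(0.0, knot - x)


def solve(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef


def residuals_at_knot(xs, ys, knot, ridge=1e-9):
    """Least-squares fit of y on [1, hinge+, hinge-]; the tiny ridge (an
    assumption of this sketch) keeps the normal equations solvable."""
    rows = [[1.0, *hinge_pair(x, knot)] for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    coef = solve(xtx, xty)
    return [y - sum(c * v for c, v in zip(coef, r)) for r, y in zip(rows, ys)]


def knot_score(xs, ys, groups, knot, lam):
    """Overall MSE plus lam times the largest gap in per-group MSE
    (a hypothetical fairness penalty, not the paper's measure)."""
    sq = [r * r for r in residuals_at_knot(xs, ys, knot)]
    by_group = {}
    for value, g in zip(sq, groups):
        by_group.setdefault(g, []).append(value)
    group_mse = [sum(v) / len(v) for v in by_group.values()]
    return sum(sq) / len(sq) + lam * (max(group_mse) - min(group_mse))


def best_knot(xs, ys, groups, candidates, lam):
    """Pick the candidate knot with the lowest penalized score."""
    return min(candidates, key=lambda t: knot_score(xs, ys, groups, t, lam))


# Toy data with a true kink at x = 0.5 and two alternating groups.
xs = [i / 20 for i in range(21)]
ys = [2.0 * max(0.0, x - 0.5) for x in xs]
groups = ["a" if i % 2 == 0 else "b" for i in range(21)]
chosen = best_knot(xs, ys, groups, [0.25, 0.5, 0.75], lam=1.0)
```

Setting `lam` to zero recovers an ordinary error-only knot search, so the penalty weight controls how much accuracy is traded for a smaller error gap between groups.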
Full Text Available
Learned mappings for targeted free energy perturbation between peptide conformations

https://doi.org/10.1063/5.0164662

Willow, Soohaeng Yoo; Kang, Lulu; Minh, David D. L. (September 2023, The Journal of Chemical Physics)

Targeted free energy perturbation uses an invertible mapping to promote configuration space overlap and the convergence of free energy estimates. However, developing suitable mappings can be challenging. Wirnsberger et al. [J. Chem. Phys. 153, 144112 (2020)] demonstrated the use of machine learning to train deep neural networks that map between Boltzmann distributions for different thermodynamic states. Here, we adapt their approach to the free energy differences of a flexible bonded molecule, deca-alanine, with harmonic biases and different spring centers. When the neural network is trained until “early stopping”—when the loss value of the test set increases—we calculate accurate free energy differences between thermodynamic states with spring centers separated by 1 Å and sometimes 2 Å. For more distant thermodynamic states, the mapping does not produce structures representative of the target state, and the method does not reproduce reference calculations.
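The targeted free energy perturbation estimator can be sketched on a one-dimensional toy problem with two harmonic biases. The example below is not the paper's neural-network mapping: it assumes a hand-written affine map between two harmonic wells, and the function name `tfep_delta_f` and all parameter values are chosen only for illustration. The estimator averages exp(-beta * phi) over samples from state A, where phi(x) = U_B(M(x)) - U_A(x) - log|J_M(x)| / beta is the generalized work.

```python
import math
import random


def tfep_delta_f(samples, u_a, u_b, mapping, log_abs_jacobian, beta=1.0):
    """Estimate F_B - F_A from samples of state A using an invertible map M."""
    neg_beta_phi = [
        -beta * (u_b(mapping(x)) - u_a(x)) + log_abs_jacobian(x) for x in samples
    ]
    top = max(neg_beta_phi)  # shift for a numerically stable log-mean-exp
    mean_exp = sum(math.exp(v - top) for v in neg_beta_phi) / len(neg_beta_phi)
    return -(top + math.log(mean_exp)) / beta


# Assumed toy states: harmonic biases with different centers and stiffness.
beta, k_a, c_a, k_b, c_b = 1.0, 1.0, 0.0, 4.0, 2.0


def u_a(x):
    return 0.5 * k_a * (x - c_a) ** 2


def u_b(x):
    return 0.5 * k_b * (x - c_b) ** 2


rng = random.Random(0)
samples = [rng.gauss(c_a, 1.0 / math.sqrt(beta * k_a)) for _ in range(2000)]

# Affine map that carries the Boltzmann distribution of A exactly onto that of B.
scale = math.sqrt(k_a / k_b)
mapped = tfep_delta_f(
    samples, u_a, u_b,
    mapping=lambda x: c_b + scale * (x - c_a),
    log_abs_jacobian=lambda x: math.log(scale),
    beta=beta,
)

# Identity map: ordinary free energy perturbation with poor overlap.
plain = tfep_delta_f(samples, u_a, u_b, lambda x: x, lambda x: 0.0, beta)

# Analytic reference for two 1D harmonic wells.
reference = 0.5 * math.log(k_b / k_a) / beta
```

With the exact map the generalized work is the same constant for every sample, so `mapped` agrees with `reference` up to rounding. The identity map relies on chance overlap between the two wells, which is the situation a learned mapping is meant to improve.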
Full Text Available
Bayesian D-Optimal Design of Experiments with Quantitative and Qualitative Responses

https://doi.org/10.51387/23-NEJSDS30

Kang, Lulu; Deng, Xinwei; Jin, Ran (April 2023, The New England Journal of Statistics in Data Science)

Systems with both quantitative and qualitative responses are widely encountered in many applications. Design of experiment methods are needed when experiments are conducted to study such systems. Classic experimental design methods are unsuitable here because they often focus on one type of response. In this paper, we develop a Bayesian D-optimal design method for experiments with one continuous and one binary response. Both noninformative and conjugate informative prior distributions on the unknown parameters are considered. The proposed design criterion has meaningful interpretations regarding the D-optimality for the models for both types of responses. An efficient point-exchange search algorithm is developed to construct the local D-optimal designs for given parameter values. Global D-optimal designs are obtained by accumulating the frequencies of the design points in local D-optimal designs, where the parameters are sampled from the prior distributions. The performances of the proposed methods are evaluated through two examples.
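The point-exchange idea can be sketched on a much simpler problem than the one in the paper. The example below assumes an ordinary quadratic linear model in one factor and maximizes the classical D-criterion det(F'F) over a small candidate grid; it does not implement the Bayesian criterion for mixed continuous and binary responses, and the function names, grid, and starting design are illustrative assumptions.

```python
def regressors(x):
    """Model terms for an assumed quadratic model: 1, x, x^2."""
    return [1.0, x, x * x]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            return 0.0  # treat as singular
        if piv != col:
            m[col], m[piv] = m[piv], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= f * m[col][c]
    return det


def d_criterion(design):
    """det(F'F) for the design, where the rows of F are the regressors."""
    rows = [regressors(x) for x in design]
    p = len(rows[0])
    info = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    return determinant(info)


def point_exchange(design, candidates, tol=1e-9):
    """Swap one design point at a time for the best candidate until no swap helps."""
    design = list(design)
    best = d_criterion(design)
    improved = True
    while improved:
        improved = False
        for i in range(len(design)):
            best_swap = design[i]
            for c in candidates:
                design[i] = c
                value = d_criterion(design)
                if value > best + tol:
                    best, best_swap, improved = value, c, True
            design[i] = best_swap
    return design, best


candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
final_design, final_value = point_exchange([-0.5, 0.0, 0.5], candidates)
```

Starting from the clustered design, the exchanges move the three runs to -1, 0, and 1, the familiar D-optimal support for a quadratic model on this interval. In a Bayesian setting like the one described above, a search of this kind would be repeated for parameter values drawn from the prior, with the resulting local designs then combined.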
Full Text Available
Sampling constrained continuous probability distributions: A review

https://doi.org/10.1002/wics.1608

Lan, Shiwei; Kang, Lulu (February 2023, WIREs Computational Statistics)

Sampling from constrained continuous distributions arises frequently in machine and statistical learning models. Many Markov chain Monte Carlo (MCMC) sampling methods have been adapted to handle different types of constraints on random variables. Among these methods, Hamiltonian Monte Carlo (HMC) and related approaches have shown significant advantages in computational efficiency over their counterparts. In this article, we first review HMC and some extended sampling methods, and then explain in detail three constrained HMC-based sampling methods: reflection, reformulation, and spherical HMC. For illustration, we apply these methods to three well-known constrained sampling problems: truncated multivariate normal distributions, Bayesian regularized regression, and nonparametric density estimation. We also connect constrained sampling with a similar problem in the statistical design of experiments with a constrained design space.
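The reflection approach can be illustrated in one dimension. The sketch below assumes a standard normal target truncated to x >= 0 and a leapfrog integrator that bounces off the boundary by mirroring the position and flipping the momentum; the step size, trajectory length, and function names are arbitrary choices for this example, not settings taken from the review.

```python
import math
import random


def potential(x):
    """Negative log density of a standard normal, up to a constant."""
    return 0.5 * x * x


def grad_potential(x):
    return x


def reflective_leapfrog(x, p, step, n_steps, lower=0.0):
    """Leapfrog trajectory that reflects off the wall at `lower`."""
    p -= 0.5 * step * grad_potential(x)
    for i in range(n_steps):
        x += step * p
        if x < lower:
            x = 2.0 * lower - x  # mirror the position back into the region
            p = -p               # flip the momentum at the wall
        if i < n_steps - 1:
            p -= step * grad_potential(x)
    p -= 0.5 * step * grad_potential(x)
    return x, p


def sample_truncated_normal(n_samples, step=0.2, n_steps=10, seed=0):
    """HMC with reflection for a standard normal restricted to x >= 0."""
    rng = random.Random(seed)
    x = 1.0
    draws = []
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)
        x_new, p_new = reflective_leapfrog(x, p0, step, n_steps)
        h_old = potential(x) + 0.5 * p0 * p0
        h_new = potential(x_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integrator's energy error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        draws.append(x)
    return draws


draws = sample_truncated_normal(4000)
sample_mean = sum(draws) / len(draws)
```

The half-normal distribution has mean sqrt(2 / pi), roughly 0.80, which gives a quick sanity check on `sample_mean`. Every draw stays in the feasible region by construction, since a proposal is only ever a reflected position.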
Full Text Available
A Maximin Φp-Efficient Design for Multivariate Generalized Linear Models

https://doi.org/10.5705/ss.202020.0278

Li, Yiou; Kang, Lulu; Deng, Xinwei (January 2023, Statistica Sinica)
Full Text Available
Locally Optimal Design for A/B Tests in the Presence of Covariates and Network Dependence

https://doi.org/10.1080/00401706.2022.2046169

Zhang, Qiong; Kang, Lulu (July 2022, Technometrics)

Full Text Available
A generative approach to modeling data with quantitative and qualitative responses

https://doi.org/10.1016/j.jmva.2022.104952

Kang, Xiaoning; Kang, Lulu; Chen, Wei; Deng, Xinwei (July 2022, Journal of Multivariate Analysis)

Full Text Available
Fair and diverse allocation of scarce resources

https://doi.org/10.1016/j.seps.2021.101193

Anahideh, Hadis; Kang, Lulu; Nezami, Nazanin (March 2022, Socio-Economic Planning Sciences)

Full Text Available
Gaussian Process Assisted Active Learning of Physical Laws

https://doi.org/10.1080/00401706.2020.1817790

Chen, Jiuhai; Kang, Lulu; Lin, Guang (July 2021, Technometrics)
Full Text Available
